8336845: [lworld] Virtual threads don't support the value class calling convention #1399
base: lworld
❗ This change is not yet ready to be integrated.
Background
The scalarized calling convention passes value objects by value: the caller passes their field values instead of a reference. Currently, only C2 does this, while the interpreter and C1 still use references to heap "buffers". When calling from the interpreter or C1 into C2-compiled code, we need to "unpack" the fields from the heap buffer. The difficulty is that unpacking value type arguments in the nmethod entry point might require additional stack space. For example, let's assume we have a method that takes two value type arguments `v1` and `v2` of a type that has four int fields `f1` - `f4`.
The register/stack layout after method entry for an un-scalarized call would look like this:
(1)
And for the scalarized call it would look like this:
(2)
In this case, the scalarized calling convention requires two additional stack slots to pass `v2.f3` and `v2.f4` because we don't have enough registers. Since we cannot overwrite stack slots in the caller frame, we need to extend the callee frame before unpacking.
To convert from (1) to (2), we have a [Verified Value Entry Point] that will extend the stack:
This is implemented in `MacroAssembler::extend_stack_for_inline_args`:
valhalla/src/hotspot/cpu/x86/macroAssembler_x86.cpp
Lines 7028 to 7039 in 12e5dda
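As a rough illustration of where the two extra stack slots in the example come from, the following sketch counts how many integer arguments spill to the stack under each convention. It is a toy model, not HotSpot code: the register count of six is an assumption for x86_64, the helper names are hypothetical, and the receiver, slot width, and alignment are all ignored.

```cpp
#include <cassert>
#include <cstddef>

// Assumption for this sketch: six integer argument registers are available
// to a Java call on x86_64. Other platforms use a different number.
constexpr std::size_t kIntArgRegisters = 6;

// Number of arguments that no longer fit in registers and must be passed
// on the stack (hypothetical helper, one slot per argument).
constexpr std::size_t stack_slots_needed(std::size_t int_args) {
  return int_args > kIntArgRegisters ? int_args - kIntArgRegisters : 0;
}

// Extra stack slots the scalarized convention needs compared to passing
// each value object as a single reference.
constexpr std::size_t extra_slots_for_scalarization(std::size_t num_value_args,
                                                    std::size_t fields_per_value) {
  const std::size_t by_reference = stack_slots_needed(num_value_args);
  const std::size_t scalarized =
      stack_slots_needed(num_value_args * fields_per_value);
  // A value type with no fields would need fewer slots, never more.
  return scalarized > by_reference ? scalarized - by_reference : 0;
}
```

Under these assumptions, `extra_slots_for_scalarization(2, 4)` models the example: eight int fields, six of which fit in registers, so two fields (`v2.f3` and `v2.f4`) need stack slots that the un-scalarized caller never reserved.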
In addition, we have a [Verified Entry Point] that will be used for scalarized to scalarized calls and will not extend the stack. It still needs to set the stack increment:
This is implemented in `C2_MacroAssembler::verified_entry`:
valhalla/src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
Lines 124 to 128 in 12e5dda
This protocol allows for an efficient conversion between the calling conventions with minimal impact to scalarized code. The only difference is that the epilog that used to have a constant stack increment now has to repair the stack:
This is implemented in `MacroAssembler::remove_frame`:
valhalla/src/hotspot/cpu/x86/macroAssembler_x86.cpp
Lines 7271 to 7274 in 12e5dda
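The protocol between the two entry points and the epilog can be sketched with a toy stack model. This is only an illustration under simplifying assumptions: the stack is an array of words, the return address and saved frame pointer are left out, the stack increment is stored in the topmost slot of the fixed frame, and all names (`ToyStack`, `enter_scalarized`, `enter_with_unpacking`, `leave`) are hypothetical rather than taken from HotSpot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy downward-growing stack addressed in words. Not HotSpot code.
struct ToyStack {
  std::vector<std::int64_t> words = std::vector<std::int64_t>(64, 0);
  std::size_t sp = 64;  // index of the current top of stack
};

// Models the entry used for scalarized-to-scalarized calls: the frame is
// not extended, but the increment is still recorded for the epilog.
// Requires fixed_words >= 1 so that the increment slot exists.
inline void enter_scalarized(ToyStack& s, std::size_t fixed_words) {
  s.sp -= fixed_words;
  s.words[s.sp + fixed_words - 1] = static_cast<std::int64_t>(fixed_words);
}

// Models the entry used by interpreter/C1 callers: first make room for the
// unpacked fields, then build the fixed frame and record the larger increment.
inline void enter_with_unpacking(ToyStack& s, std::size_t fixed_words,
                                 std::size_t extra_words) {
  s.sp -= extra_words;
  s.sp -= fixed_words;
  s.words[s.sp + fixed_words - 1] =
      static_cast<std::int64_t>(fixed_words + extra_words);
}

// Models the epilog: instead of popping a compile-time constant, pop the
// increment that the entry point stored in the frame.
inline void leave(ToyStack& s, std::size_t fixed_words) {
  s.sp += static_cast<std::size_t>(s.words[s.sp + fixed_words - 1]);
}
```

In this model the same `leave` restores the caller's stack pointer regardless of which entry was taken, which is the property the protocol relies on: the scalarized path pays only for one extra store at entry and one load in the epilog.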
Also, stack walking in `frame::safe_for_sender` and `frame::sender_for_compiled_frame` now needs to repair the stack to get a correct `sender_sp` if the current frame extended the stack. This is done in `frame::repair_sender_sp()` by adding the `sp_inc` if necessary:
valhalla/src/hotspot/cpu/x86/frame_x86.cpp
Lines 702 to 714 in 12e5dda
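The idea behind the repair can be sketched as follows. This is a simplified model rather than the actual `frame::repair_sender_sp()` code: the `FrameInfo` struct and its fields are hypothetical, sizes are in words, and the stored increment is assumed to already be available instead of being read from a stack slot.

```cpp
#include <cassert>
#include <cstddef>

// Simplified, hypothetical description of one compiled frame.
struct FrameInfo {
  std::size_t sp;                  // stack pointer of this frame (word index)
  std::size_t static_frame_words;  // frame size known at compile time
  bool needs_stack_repair;         // true if the entry point may extend the frame
  std::size_t stored_sp_inc;       // increment recorded in the frame at entry
};

// Computes the sender's stack pointer. For frames that may have been
// extended, the recorded increment replaces the compile-time frame size.
inline std::size_t sender_sp(const FrameInfo& f) {
  std::size_t size = f.static_frame_words;
  if (f.needs_stack_repair) {
    // An extended frame can only be larger than the static size.
    assert(f.stored_sp_inc >= f.static_frame_words);
    size = f.stored_sp_inc;
  }
  return f.sp + size;
}
```

For a frame with a static size of 4 words that was extended by 2, a walker using only the static size would place the sender 2 words too low; using the recorded increment of 6 yields the correct position. Any code that walks or copies frames has to apply the same correction.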
Deoptimization uses the same stack walking code and is therefore not affected.
This PR
Code for freezing and thawing virtual threads walks stack frames and therefore needs to be aware of C2 compiled frames that have a dynamic size and might need a stack repair.
Thanks,
Tobias
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/valhalla.git pull/1399/head:pull/1399
$ git checkout pull/1399
Update a local copy of the PR:
$ git checkout pull/1399
$ git pull https://git.openjdk.org/valhalla.git pull/1399/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 1399
View PR using the GUI difftool:
$ git pr show -t 1399
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/valhalla/pull/1399.diff